375 research outputs found

2010 ISCB Overton Prize Awarded to Steven E. Brenner

Author: AG Murzin
BJ Morrison McKay
BP Lewis
Clare Sansom
JE Stajich
JM Chandonia
LF Lareau
SE Brenner
Publication venue: Public Library of Science
Publication date: 01/06/2010

Crossref

Directory of Open Access Journals

PubMed Central

An intuitive Python interface for Bioconductor libraries demonstrates the utility of language translators

Author: DG Bobrow
JE Stajich
L Prechelt
Laurent Gautier
MD Robinson
PJ Cock
R Development Core Team
R Knight
RC Gentleman
RCG Holland
Publication venue: BioMed Central
Publication date: 01/01/2010

BACKGROUND: Computer languages can be domain-related, and in multidisciplinary projects, knowledge of several languages is needed to implement ideas quickly. Moreover, each computer language has relative strong points, making some languages better suited than others for a given task. The Bioconductor project, based on the R language, has become a reference for the numerical processing and statistical analysis of data coming from high-throughput biological assays, providing a rich selection of methods and algorithms to the research community. At the same time, Python has matured as a rich and reliable language for the agile development of prototypes or final implementations, as well as for handling large data sets. RESULTS: The data structures and functions from Bioconductor can be exposed to Python as a regular library. This allows a fully transparent and native use of Bioconductor from Python, without one having to know the R language and with only a small community of translators required to know both. To demonstrate this, we have implemented such Python representations for key infrastructure packages in Bioconductor, letting a Python programmer handle annotation data, microarray data, and next-generation sequencing data. CONCLUSIONS: Bioconductor is no longer reserved solely for R users. Building a Python application using Bioconductor functionality can be done just as if Bioconductor were a Python package. Moreover, similar principles can be applied to other languages and libraries. Our Python package is available at: http://pypi.python.org/pypi/rpy2-bioconductor-extensions/

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Online Research Database In Technology

Ten Simple Rules for Getting Help from Online Scientific Communities

Author: Brandon M. Invergo
Colin S. Gillespie
ES Raymond
Giovanni M. Dall'Olio
Hafid Laayouni
Jacopo Marino
Jaume Bertranpetit
JE Stajich
Kevin L. Keys
Khader Shameer
Lars J. Jensen
M Ash
Melanie I. Stefan
Michael Schubert
PE Bourne
Philip E. Bourne
Pierre Poulain
Robert Sugar
W Miller
Publication venue: Public Library of Science (PLoS)
Publication date: 01/09/2011

The increasing complexity of research requires scientists to work at the intersection of multiple fields and to face problems for which their formal education has not prepared them. For example, biologists with little or no background in programming are now often using complex scripts to handle the results from their experiments; conversely, programmers wishing to enter the world of bioinformatics must know about biochemistry, genetics, and other fields. In this context, communication tools such as mailing lists, web forums, and online communities acquire increasing importance. These tools permit scientists to quickly contact people skilled in a specialized field. A question posed properly to the right online scientific community can help in solving difficult problems, often faster than screening literature or writing to publication authors. The growth of active online scientific communities, such as those listed in Table S1, demonstrates how these tools are becoming an important source of support for an increasing number of researchers. Nevertheless, making proper use of these resources is not easy. Adhering to the social norms of World Wide Web communication, loosely termed “netiquette”, is both important and non-trivial. In this article, we take inspiration from our experience on Internet-shared scientific knowledge, and from similar documents such as “Asking the Questions the Smart Way” and “Getting Answers”, to provide guidelines and suggestions on how to use online communities to solve scientific problems.

Crossref

Directory of Open Access Journals

HAL-Inserm

PubMed Central

Copenhagen University Research Information System

Caltech Authors

Digital.CSIC

Hal-Diderot

A single fungal strain was the unexpected cause of a mass aspergillosis outbreak in the world's largest and only flightless parrot.

Author: Cox MP
Dearden PK
Digby A
Fisher MC
Glare T
Kākāpō Aspergillosis Research Consortium
Perrott J
Rhodes J
Stajich JE
Weir BS
Winter DJ
Publication venue: Elsevier BV
Publication date: 28/10/2022

Kākāpō are a critically endangered species of parrot restricted to a few islands off the coast of New Zealand. Kākāpō are very closely monitored, especially during nesting seasons. In 2019, during a highly successful nesting season, an outbreak of aspergillosis affected 21 individuals and led to the deaths of 9, leaving a population of only 211 kākāpō. In monitoring this outbreak, cultures of Aspergillus were grown and their genomes sequenced. These sequences demonstrate that, very unusually for an Aspergillus outbreak, a single strain of Aspergillus caused the outbreak. This strain was found on two islands, but only one had an outbreak of aspergillosis, indicating that the strain was necessary, but not sufficient, to cause disease. Our analysis provides an understanding of the 2019 outbreak and suggests potential ways to manage such events in the future.

PubMed Central

eScholarship - University of California

Spiral - Imperial College Digital Repository

TaxMan: a taxonomic database manager

Author: A Rokas
C Lee
D Gordon
DA Benson
H Philippe
JD Thompson
JE Stajich
M Jones
Mark Blaxter
Martin Jones
PC Feijao
SA Olson
SF Altschul
W Ludwig
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Phylogenetic analysis of large, multiple-gene datasets, assembled from public sequence databases, is rapidly becoming a popular way to approach difficult phylogenetic problems. Supermatrices (concatenated multiple sequence alignments of multiple genes) can yield more phylogenetic signal than individual genes. However, manually assembling such datasets for a large taxonomic group is time-consuming and error-prone. Additionally, sequence curation, alignment and assessment of the results of phylogenetic analysis are made particularly difficult by the potential for a given gene in a given species to be unrepresented, or to be represented by multiple or partial sequences. We have developed a software package, TaxMan, that largely automates the processes of sequence acquisition, consensus building, alignment and taxon selection to facilitate this type of phylogenetic study. RESULTS: TaxMan uses freely available tools to allow rapid assembly, storage and analysis of large, aligned DNA and protein sequence datasets for user-defined sets of species and genes. The user provides GenBank format files and a list of gene names and synonyms for the loci to analyse. Sequences are extracted from the GenBank files on the basis of annotation and sequence similarity. Consensus sequences are built automatically. Alignment is carried out (where possible, at the protein level) and aligned sequences are stored in a database. TaxMan can automatically determine the best subset of taxa to examine phylogeny at a given taxonomic level. By using the stored aligned sequences, large concatenated multiple sequence alignments can be generated rapidly for a subset and output in analysis-ready file formats. Trees resulting from phylogenetic analysis can be stored and compared with a reference taxonomy. CONCLUSION: TaxMan allows rapid automated assembly of multigene datasets of aligned sequences for large taxonomic groups. By extracting sequences on the basis of both annotation and BLAST similarity, it ensures that all available sequence data can be brought to bear on a phylogenetic problem, but remains fast enough to cope with many thousands of records. By automatically assisting in the selection of the best subset of taxa to address a particular phylogenetic problem, TaxMan greatly speeds up the process of generating multiple sequence alignments for phylogenetic analysis. Our results indicate that an automated phylogenetic workbench can be a useful tool when correctly guided by user knowledge.
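
The supermatrix idea in the abstract above (concatenating per-gene alignments, with some genes unrepresented in some species) can be illustrated with a minimal Python sketch. This is not TaxMan's code: the function name `build_supermatrix`, the dictionary layout, and the choice to pad missing genes with gap characters are all assumptions made for illustration.

```python
def build_supermatrix(alignments, taxa, missing="-"):
    """Concatenate per-gene alignments into one row per taxon.

    alignments: dict mapping gene name -> dict mapping taxon -> aligned sequence.
                All sequences within one gene must have the same length.
    taxa:       list of taxon names to include in the supermatrix.
    missing:    character used to pad genes a taxon lacks (an assumption here).
    """
    rows = {taxon: [] for taxon in taxa}
    # Process genes in sorted order so the column layout is reproducible.
    for gene in sorted(alignments):
        gene_alignment = alignments[gene]
        lengths = {len(seq) for seq in gene_alignment.values()}
        if len(lengths) != 1:
            raise ValueError(f"sequences for {gene} are not aligned to one length")
        width = lengths.pop()
        for taxon in taxa:
            # A taxon with no sequence for this gene gets a block of gaps.
            rows[taxon].append(gene_alignment.get(taxon, missing * width))
    return {taxon: "".join(parts) for taxon, parts in rows.items()}


# Hypothetical toy data: two genes, three species, each gene missing one species.
example_alignments = {
    "geneA": {"sp1": "ATG-", "sp2": "ATGC"},
    "geneB": {"sp1": "GG", "sp3": "GA"},
}
supermatrix = build_supermatrix(example_alignments, ["sp1", "sp2", "sp3"])
```

Every row of the result has the same length, so it can be written out in an alignment format of the user's choice; how TaxMan itself represents missing data is not stated in the abstract.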

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Edinburgh Research Explorer

PseudoGeneQuest – Service for identification of different pseudogene types in the human genome

Author: A Barak
A Khelifi
AJ Mighell
C Ortutay
Csaba Ortutay
E Vargas-Madrazo
EF Vanin
ES Balakirev
H Arakawa
JE Karro
JE Stajich
Mauno Vihinen
P Flicek
PM Harrison
S McGinnis
WR Pearson
Z Zhang
Publication venue: BioMed Central
Publication date: 01/01/2008

BACKGROUND: Pseudogenes, nonfunctional copies of genes, evolve fast due to the lack of evolutionary pressures and thus appear in several different forms. PseudoGeneQuest is an online tool to search the human genome for a given query sequence and to identify different types of pseudogenes as well as novel genes and gene fragments. DESCRIPTION: The service can detect pseudogenes that have arisen either by retrotransposition or segmental genome duplication, many of which are not listed in the public pseudogene databases. The service has a user-friendly web interface and uses a powerful computer cluster in order to perform parallel searches and provide relatively fast runtimes despite exhaustive database searches and analyses. CONCLUSION: PseudoGeneQuest is a versatile tool for detecting novel pseudogene candidates from the human genome. The service searches human genome sequences for five types of pseudogenes and provides an output that allows easy further analysis of observations. In addition to the result file, the system provides visualization of the results linked to the Ensembl Genome Browser. The PseudoGeneQuest service is freely available.

Lund University Publications

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

TamPub Julkaisuarkisto - TamPub Institutional Repository

Trepo - Institutional Repository of Tampere University

Distribution of Introns in Fungal Histone Genes

Author: A Ehinger
Choong-Soo Yun
EE Kuramae
GS May
H Nishida
H Wang
Hiromi Nishida
Jason Stajich
JE Stajich
K Luger
K Tamura
LP Woudt
MA Osley
ME Chabouté
ME Harris
T Igo-Kemenes
TH Thatcher
WF Marzluff
WR Pearson
Y Liu
Z Xu
Publication venue: Public Library of Science
Publication date: 01/01/2011

Saccharomycotina and Taphrinomycotina lack introns in their histone genes, except for an intron in one of the histone H4 genes of Yarrowia lipolytica. On the other hand, Basidiomycota and Pezizomycotina have introns in their histone genes. We compared the distributions of 81, 47, 79, and 98 introns in the fungal histone H2A, H2B, H3, and H4 genes, respectively. Based on the multiple alignments of the amino acid sequences of histones, we identified 19, 13, 31, and 22 intron insertion sites in the histone H2A, H2B, H3, and H4 genes, respectively. Surprisingly, only one hot spot of introns in the histone H2A gene is shared between Basidiomycota and Pezizomycotina, suggesting that most of the introns of Basidiomycota and Pezizomycotina were acquired independently. Our findings suggest that the common ancestor of Ascomycota and Basidiomycota may have had a few introns in the histone genes. In the course of fungal evolution, Saccharomycotina and Taphrinomycotina lost the histone introns; Basidiomycota and Pezizomycotina acquired other introns independently. In addition, most of the introns have sequence similarity among introns of phylogenetically close species, strongly suggesting that horizontal intron transfer events between phylogenetically distant species have not occurred recently in the fungal histone genes.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Expedited batch processing and analysis of transposon insertions

Author: A Bohne
AL Price
BC Meyers
CM Bergman
David A Ray
ES Lander
GD Schuler
JE Stajich
Jeremy D Smith
PL Deininger
RC Edgar
RH Waterston
Sela
SF Altschul
Publication venue: BioMed Central
Publication date: 01/11/2011

BACKGROUND: With advances in sequencing technology, greater and greater amounts of eukaryotic genome data are becoming available. Often, large portions of these genomes consist of transposable elements, frequently accounting for 50% or more in vertebrates. Each transposable element family may have thousands or tens of thousands of individual copies within a given genome, and therefore it can take an exorbitant amount of time and effort to process data in a meaningful fashion. FINDINGS: In order to combat this problem, we developed a set of bioinformatics techniques and programs to streamline the analysis. This includes a Perl script that automates the process of taking BLAST, RepeatMasker and similar data to extract and manipulate the hit sequences from the genome. This script, called Process_hits, uses an object-oriented methodology to compile all hit locations from a given file for processing, organize this data into usable categories, and output it in multiple formats. CONCLUSIONS: The program proved capable of handling large amounts of transposon data in an efficient fashion. It is equipped with a number of useful sub-functions, each of which is contained within its own sub-module to allow for greater expandability and as a foundation for future program design.
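
The workflow described above (collect hit locations from a search report, then pull the matching sequences out of the genome) can be sketched in a few lines of Python. This is not Process_hits, which is a Perl program whose interface is not given here. The sketch assumes BLAST tabular output (12 tab-separated columns, with subject start and end in columns 9 and 10) and a genome already loaded into a dictionary; the names `Hit`, `parse_tabular_hits`, `extract_hit`, and the `flank` parameter are hypothetical.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    start: int   # 1-based, inclusive, always <= end
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


_COMPLEMENT = str.maketrans("ACGTacgtNn", "TGCAtgcaNn")


def parse_tabular_hits(lines):
    """Collect hit locations from BLAST tabular lines (assumed 12-column layout)."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        sstart, send = int(fields[8]), int(fields[9])
        # BLAST reports minus-strand hits with subject start > subject end.
        strand = "+" if sstart <= send else "-"
        hits.append(Hit(fields[0], fields[1], min(sstart, send), max(sstart, send), strand))
    return hits


def extract_hit(genome, hit, flank=0):
    """Return the hit sequence plus optional flanking bases, on the hit's strand."""
    seq = genome[hit.subject]
    left = max(hit.start - 1 - flank, 0)      # convert to 0-based, clamp at start
    right = min(hit.end + flank, len(seq))    # clamp at sequence end
    piece = seq[left:right]
    if hit.strand == "-":
        piece = piece.translate(_COMPLEMENT)[::-1]  # reverse complement
    return piece


# Hypothetical toy input: one plus-strand and one minus-strand hit.
genome = {"chr1": "AAACCCGGGTTT"}
report = [
    "te1\tchr1\t100.0\t3\t0\t0\t1\t3\t4\t6\t1e-5\t50",
    "te1\tchr1\t100.0\t5\t0\t0\t1\t5\t6\t2\t1e-5\t50",
]
hits = parse_tabular_hits(report)
sequences = [extract_hit(genome, h) for h in hits]
```

Keeping parsing and extraction in separate functions mirrors the modular layout the abstract describes, and would let a RepeatMasker parser reuse `extract_hit` unchanged.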

Crossref

Directory of Open Access Journals

PubMed Central